Cluster trials in implementation research: estimation of intracluster correlation coefficients and sample size

Cluster randomized trials (cRCTs) to assess vaccine effectiveness incorporate indirect effects of vaccination, helping to inform vaccination policy. To calculate the sample size for a cRCT, an estimate of the intracluster correlation coefficient (ICC) is required. For infectious diseases, shared characteristics and social mixing behaviours may increase susceptibility and exposure, promote transmission and be a source of clustering. We present ICCs from a school-based cRCT assessing the effectiveness of a meningococcal B vaccine (Bexsero, GlaxoSmithKline) on reducing oropharyngeal carriage of Neisseria meningitidis (Nm) in 34,489 adolescents from 237 schools in South Australia in 2017/2018. We also explore the contribution of shared behaviours and characteristics to these ICCs. The ICC for carriage of disease-causing Nm genogroups (primary outcome) pre-vaccination was 0.004 (95% CI: 0.002, 0.007) and for all Nm was 0.007 (95% CI: 0.004, 0.011). Adjustment for social behaviours and personal characteristics reduced the ICC for carriage of disease-causing and all Nm genogroups by 25% (to 0.003) and 43% (to 0.004), respectively. ICCs are also reported for risk factors, which may serve as outcomes in future research. Higher ICCs were observed for susceptibility and/or exposure variables related to Nm carriage (having a cold, spending ≥1 night out socializing or kissing ≥1 person in the previous week). In metropolitan areas, nights out socializing was a highly correlated behaviour. By contrast, smoking was a highly correlated behaviour in rural areas. A practical example to inform future cRCT sample size estimates is provided.
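
How an ICC feeds into a cRCT sample size can be sketched with the standard design effect for equal cluster sizes, 1 + (m - 1) × ICC, where m is the mean cluster size. This is a minimal illustration, not the practical example from the paper: the function names and the individually randomized sample size of 1,000 are assumptions, and the mean cluster size is only approximated from the 34,489 adolescents in 237 schools reported above.

```python
import math


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Standard design effect 1 + (m - 1) * ICC, assuming equal cluster sizes."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def clusters_needed(n_individual: int, mean_cluster_size: float, icc: float) -> int:
    """Total clusters (across all arms) after inflating an individually
    randomized sample size by the design effect."""
    inflated = n_individual * design_effect(mean_cluster_size, icc)
    return math.ceil(inflated / mean_cluster_size)


if __name__ == "__main__":
    # Approximate mean school size from the abstract (assumes equal sizes).
    m = 34489 / 237
    # Hypothetical sample size under individual randomization.
    n_individual = 1000
    for icc in (0.004, 0.007):  # pre-vaccination ICCs quoted in the abstract
        de = design_effect(m, icc)
        print(f"ICC={icc}: design effect={de:.2f}, "
              f"clusters={clusters_needed(n_individual, m, icc)}")
```

Even ICCs this small matter when clusters are large, because the design effect grows linearly with cluster size.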

Intracluster correlation coefficients for sample size calculations related to cardiovascular disease prevention and management in primary care practices

BMC Research Notes ◽

10.1186/s13104-015-1042-y ◽

2015 ◽

Vol 8 (1) ◽

pp. 89 ◽

Cited By ~ 16

Author(s):

Jatinderpreet Singh ◽

Clare Liddy ◽

William Hogg ◽

Monica Taljaard

Keyword(s):

Cardiovascular Disease ◽

Primary Care ◽

Disease Prevention ◽

Sample Size ◽

Correlation Coefficients ◽

Cardiovascular Disease Prevention ◽

Intracluster Correlation ◽

Sample Size Calculations ◽

Care Practices ◽

Primary Care Practices

Intracluster correlation coefficients from the 2005 WHO Global Survey on Maternal and Perinatal Health: implications for implementation research

Paediatric and Perinatal Epidemiology ◽

10.1111/j.1365-3016.2007.00901.x ◽

2008 ◽

Vol 22 (2) ◽

pp. 117-125 ◽

Cited By ~ 33

Author(s):

Monica Taljaard ◽

Allan Donner ◽

José Villar ◽

Daniel Wojdyla ◽

Alejandro Velazco ◽

...

Keyword(s):

Implementation Research ◽

Correlation Coefficients ◽

Perinatal Health ◽

Intracluster Correlation ◽

Health Implications ◽

Global Survey

Intra-cluster correlations from the CLustered OUtcome Dataset bank to inform the design of longitudinal cluster trials

Clinical Trials ◽

10.1177/17407745211020852 ◽

2021 ◽

pp. 174077452110208

Author(s):

Elizabeth Korevaar ◽

Jessica Kasza ◽

Monica Taljaard ◽

Karla Hemming ◽

Terry Haines ◽

...

Keyword(s):

Sample Size ◽

Discrete Time ◽

Correlation Coefficients ◽

Time Decay ◽

Randomised Trials ◽

Cluster Randomised Trials ◽

Cluster Randomised ◽

Sample Size Calculations ◽

Correlation Structures ◽

Cluster Correlation

Background: Sample size calculations for longitudinal cluster randomised trials, such as crossover and stepped-wedge trials, require estimates of the assumed correlation structure. This includes both within-period intra-cluster correlations, which importantly differ from conventional intra-cluster correlations by their dependence on period, and also cluster autocorrelation coefficients to model correlation decay. There are limited resources to inform these estimates. In this article, we provide a repository of correlation estimates from a bank of real-world clustered datasets. These are provided under several assumed correlation structures, namely exchangeable, block-exchangeable and discrete-time decay correlation structures. Methods: Longitudinal studies with clustered outcomes were collected to form the CLustered OUtcome Dataset bank. Forty-four available continuous outcomes from 29 datasets were obtained and analysed using each correlation structure. Patterns of within-period intra-cluster correlation coefficient and cluster autocorrelation coefficients were explored by study characteristics. Results: The median within-period intra-cluster correlation coefficient for the discrete-time decay model was 0.05 (interquartile range: 0.02–0.09) with a median cluster autocorrelation of 0.73 (interquartile range: 0.19–0.91). The within-period intra-cluster correlation coefficients were similar for the exchangeable, block-exchangeable and discrete-time decay correlation structures. Within-period intra-cluster correlation coefficients and cluster autocorrelations were found to vary with the number of participants per cluster-period, the period-length, type of cluster (primary care, secondary care, community or school) and country income status (high-income country or low- and middle-income country). The within-period intra-cluster correlation coefficients tended to decrease with increasing period-length and slightly decrease with increasing cluster-period sizes, while the cluster autocorrelations tended to move closer to 1 with increasing cluster-period size. Using the CLustered OUtcome Dataset bank, an RShiny app has been developed for determining plausible values of correlation coefficients for use in sample size calculations. Discussion: This study provides a repository of intra-cluster correlations and cluster autocorrelations for longitudinal cluster trials. This can help inform sample size calculations for future longitudinal cluster randomised trials.
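
The three correlation structures named in this abstract can be sketched as matrices whose entry [s][t] is the correlation between outcomes of two different individuals in the same cluster, measured in periods s and t. This is a minimal sketch under the usual parameterisation (within-period ICC rho, cluster autocorrelation r); the function name is hypothetical, and it is not the authors' RShiny app. The example values are the discrete-time decay medians quoted above.

```python
from typing import List

STRUCTURES = ("exchangeable", "block-exchangeable", "discrete-time decay")


def between_period_icc(rho: float, r: float, n_periods: int,
                       structure: str) -> List[List[float]]:
    """Correlation between two individuals in the same cluster, by period pair.

    rho: within-period intra-cluster correlation
    r:   cluster autocorrelation (ignored for the exchangeable structure)
    """
    if structure not in STRUCTURES:
        raise ValueError(f"unknown structure: {structure!r}")
    matrix = []
    for s in range(n_periods):
        row = []
        for t in range(n_periods):
            if s == t or structure == "exchangeable":
                row.append(rho)                      # same correlation everywhere
            elif structure == "block-exchangeable":
                row.append(rho * r)                  # one fixed between-period value
            else:
                row.append(rho * r ** abs(s - t))    # decays with period separation
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    for name in STRUCTURES:
        print(name)
        for row in between_period_icc(0.05, 0.73, 3, name):
            print("  ", [round(v, 4) for v in row])
```

Under the exchangeable structure the between-period correlation equals the within-period one, which is the special case r = 1 of the other two.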

Relationship amongst ResearchGate altmetric indicators and Scopus bibliometric indicators

New Library World ◽

10.1108/nlw-03-2015-0017 ◽

2015 ◽

Vol 116 (9/10) ◽

pp. 564-577 ◽

Cited By ~ 10

Author(s):

Rishabh Shrivastava ◽

Preeti Mahajan

Keyword(s):

Sample Size ◽

Correlation Coefficients ◽

Strong Positive Correlation ◽

Bibliometric Indicators ◽

Content Type ◽

Positive Correlation ◽

Pearson's Correlation ◽

Different Characteristics ◽

The Relationship

Purpose – The purpose of this paper is twofold. First, the study aims to investigate the relationship between the altmetric indicators from ResearchGate (RG) and the bibliometric indicators from the Scopus database. Second, the study seeks to examine the relationship amongst the RG altmetric indicators themselves. RG is a rich source of altmetric indicators such as Citations, RGScore, Impact Points, Profile Views, Publication Views, etc. Design/methodology/approach – To establish whether RG metrics showed the same results as the established sources of metrics, Pearson’s correlation coefficients were calculated between the metrics provided by RG and the metrics obtained from Scopus. Pearson’s correlation coefficients were also calculated amongst the metrics provided by RG. The data were collected by visiting the profile pages of all the members who had an account in RG under the Department of Physics, Panjab University, Chandigarh (India). Findings – The study showed that most of the RG metrics showed strong positive correlation with the Scopus metrics, except for RGScore (RG) and Citations (Scopus), which showed moderate positive correlation. It was also found that the RG metrics showed moderate to strong positive correlation amongst each other. Research limitations/implications – The limitation of this study is that more and more scientists and researchers may join RG in the future, therefore the data may change. The study focuses on the members who had an account in RG under the Department of Physics, Panjab University, Chandigarh (India). Further studies could be conducted by increasing the sample size and by taking a different sample with different characteristics. Originality/value – Altmetrics being an emerging field, not much research has been conducted in the area. Very few studies have been conducted on the reach of academic social networks like RG and their validity as sources of altmetric indicators like RGScore, Impact Points, etc. The findings offer insights into the question of whether RG can be used as an alternative to traditional sources of bibliometric indicators, especially with reference to a rapidly developing country such as India.

Intracluster correlation coefficients for the Brazilian Multicenter Study on Preterm Birth (EMIP): methodological and practical implications

BMC Medical Research Methodology ◽

10.1186/1471-2288-14-54 ◽

2014 ◽

Vol 14 (1) ◽

Cited By ~ 6

Author(s):

Giuliane J Lajos ◽

Samira M Haddad ◽

Ricardo P Tedesco ◽

Renato Passini Jr ◽

...

Keyword(s):

Preterm Birth ◽

Multicenter Study ◽

Correlation Coefficients ◽

Intracluster Correlation ◽

Practical Implications

Standardising the measurement of physical activity in people receiving haemodialysis: considerations for research and practice

BMC Nephrology ◽

10.1186/s12882-019-1634-1 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 2

Author(s):

Hannah M. L. Young ◽

Mark W. Orme ◽

Yan Song ◽

Maurice Dungey ◽

James O. Burton ◽

...

Keyword(s):

Physical Activity ◽

Sample Size ◽

Repeated Measures ◽

A Priori ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Future Research ◽

Wear Time ◽

Minimum Number ◽

The UK

Background: Physical activity (PA) is exceptionally low amongst the haemodialysis (HD) population, and physical inactivity is a powerful predictor of mortality, making it a prime focus for intervention. Objective measurement of PA using accelerometers is increasing, but standard reporting guidelines essential to effectively evaluate, compare and synthesise the effects of PA interventions are lacking. This study aims to (i) determine the measurement and processing guidance required to ensure representative PA data amongst a diverse HD population, and (ii) assess adherence to PA monitor wear amongst HD patients. Methods: Clinically stable HD patients from the UK and China wore a SenseWear Armband accelerometer for 7 days. Step counts between days (HD, weekday, weekend) were compared using repeated measures ANCOVA. Intraclass correlation coefficients (ICCs) determined reliability (≥ 0.80 acceptable). The Spearman-Brown prophecy formula, in conjunction with a priori ≥ 80% sample size retention, identified the minimum number of days required for representative PA data. Results: Seventy-seven patients (64% men, mean ± SD age 56 ± 14 years, median (interquartile range) time on HD 40 (19–72) months, 40% Chinese, 60% British) participated. Participants took fewer steps on HD days compared with non-HD weekdays and weekend days (3402 [95% CI 2665–4140], 4914 [95% CI 3940–5887], 4633 [95% CI 3558–5707] steps/day, respectively, p < 0.001). PA on HD days was less variable than on non-HD days (ICC 0.723–0.839 versus 0.559–0.611), with ≥ 1 HD day and ≥ 3 non-HD days required to provide representative data. Using these criteria, the most stringent wear-time retaining ≥ 80% of the sample was ≥ 7 h. Conclusions: At group level, a wear-time of ≥ 7 h on ≥ 1 HD day and ≥ 3 non-HD days is required to provide reliable PA data whilst retaining an acceptable sample size. PA is low across both HD and non-HD days, and future research should focus on interventions designed to increase physical activity in both the intradialytic and interdialytic periods.
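
The Spearman-Brown prophecy formula mentioned in the Methods can be sketched as follows. This is a minimal illustration, not the study's analysis: it assumes the input is a single-day reliability (ICC), the function names are hypothetical, and the value 0.6 in the example is made up rather than taken from the results above.

```python
import math


def spearman_brown(single_day_icc: float, n_days: int) -> float:
    """Predicted reliability of the mean over n_days of monitoring."""
    return n_days * single_day_icc / (1.0 + (n_days - 1) * single_day_icc)


def min_days(single_day_icc: float, target: float = 0.80) -> int:
    """Smallest whole number of days whose predicted reliability reaches target."""
    if not 0.0 < single_day_icc < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("single_day_icc and target must lie strictly between 0 and 1")
    k = target * (1.0 - single_day_icc) / (single_day_icc * (1.0 - target))
    # Small tolerance so exact integer solutions are not rounded up by float error.
    return max(1, math.ceil(k - 1e-9))


if __name__ == "__main__":
    icc = 0.6  # hypothetical single-day ICC
    days = min_days(icc, target=0.80)
    print(days, round(spearman_brown(icc, days), 3))
```

The 0.80 default mirrors the acceptability threshold stated in the abstract; any other target between 0 and 1 can be passed instead.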

Randomizing patients by family practice: sample size estimation, intracluster correlation and data analysis

Family Practice ◽

10.1093/fampra/20.1.77 ◽

2003 ◽

Vol 20 (1) ◽

pp. 77-82 ◽

Cited By ~ 38

Author(s):

Roxanne H Cosby ◽

Michelle Howard ◽

Janusz Kaczorowski ◽

Andrew R Willan ◽

John W Sellors

Keyword(s):

Data Analysis ◽

Sample Size ◽

Family Practice ◽

Sample Size Estimation ◽

Size Estimation ◽

Intracluster Correlation

Determinants of the intracluster correlation coefficient in cluster randomized trials: the case of implementation research

Clinical Trials ◽

10.1191/1740774505cn071oa ◽

2005 ◽

Vol 2 (2) ◽

pp. 99-107 ◽

Cited By ~ 137

Author(s):

Marion K Campbell ◽

Peter M Fayers ◽

Jeremy M Grimshaw

Keyword(s):

Correlation Coefficient ◽

Implementation Research ◽

Randomized Trials ◽

Cluster Randomized Trials ◽

Intracluster Correlation ◽

Intracluster Correlation Coefficient ◽

Cluster Randomized

Weighted Maximum Likelihood Correlation Coefficient to Handle Missing Values and Outliers in Data Set

WSEAS TRANSACTIONS ON MATHEMATICS ◽

10.37394/23206.2021.20.43 ◽

2021 ◽

Vol 20 ◽

pp. 415-430

Author(s):

Juthaphorn Sinsomboonthong ◽

Saichon Sinsomboonthong

Keyword(s):

Missing Data ◽

Maximum Likelihood ◽

Sample Size ◽

Correlation Coefficient ◽

Missing Values ◽

Mean Squared Error ◽

Correlation Coefficients ◽

Median Percentage ◽

Data Set ◽

Median Correlation

This paper presents a weighted maximum likelihood (WML) correlation coefficient, an estimator of the relationship between two variables that accounts for missing values and outliers in the dataset. The estimator is derived by applying the conditional probability function to handle missing values and to give more weight to values near the center, while outliers are assigned only a slight weight. These techniques make the proposed method robust when the preliminary assumptions of the data analysis are not met. To assess the quality of the proposed estimator, six methods (the WML, Pearson, median, percentage bend, biweight mid, and composite correlation coefficients) are compared on two criteria, bias and mean squared error, in a simulation study. The results on generated data suggest that the WML estimator withstands missing values and outliers best, especially for very small sample sizes and a large percentage of outliers, regardless of the level of missing data. For very large sample sizes, however, the median correlation coefficient appears to be a good estimator when the linear relationship between the two variables is approximately above 0.4, irrespective of the levels of outliers and missing data.
